[ES-1717770] Fix TIMEDOUT_STATE not recognized as error on interactive clusters by samikshya-db · Pull Request #1244 · databricks/databricks-jdbc

samikshya-db · 2026-03-02T12:16:57Z

Description

Follow-up to #1199 (ES-1717770). The previous fix covered the case where FetchResults returns an error status with sqlState=57KD0. However, the interactive cluster path was still broken.

Root cause: When using interactive clusters with enableDirectResults=true, the cluster can enforce its own server-side query timeout and return TIMEDOUT_STATE directly in directResults.operationStatus — before the client's polling loop ever starts. Because isErrorOperationState did not include TIMEDOUT_STATE, the driver:

1. Did not throw in checkOperationStatusForErrors.
2. Got false from shouldContinuePolling(TIMEDOUT_STATE), so the polling loop never started and TimeoutHandler never fired.
3. Fell through to executeFetchRequest, where the server returned an error, and the driver threw DatabricksHttpException instead of DatabricksTimeoutException.

The same gap also affects the polling path when GetOperationStatus returns TIMEDOUT_STATE during polling.
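The routing described above can be modelled with a small standalone sketch. It is not the driver's code: the nested `State` enum, the contents of the pre-fix error set, and the `nextStep` helper are assumptions made for illustration. Only the method names `isErrorOperationState` and `shouldContinuePolling` come from this description.

```java
import java.util.EnumSet;
import java.util.Set;

class PreFixRoutingSketch {
  // Hypothetical stand-in for the Thrift operation state type.
  enum State { PENDING_STATE, RUNNING_STATE, FINISHED_STATE, ERROR_STATE, CLOSED_STATE, TIMEDOUT_STATE }

  // Assumed pre-fix error set. The only point that matters here is that
  // TIMEDOUT_STATE is absent; the other members are illustrative.
  static final Set<State> ERROR_STATES = EnumSet.of(State.ERROR_STATE, State.CLOSED_STATE);

  static boolean isErrorOperationState(State s) {
    return ERROR_STATES.contains(s);
  }

  // Assumed: polling continues only while the query is still in flight.
  static boolean shouldContinuePolling(State s) {
    return s == State.PENDING_STATE || s == State.RUNNING_STATE;
  }

  // Hypothetical helper: names the step the driver reaches for a given
  // state found in directResults.operationStatus.
  static String nextStep(State s) {
    if (isErrorOperationState(s)) {
      return "throw in checkOperationStatusForErrors";
    }
    if (shouldContinuePolling(s)) {
      return "enter polling loop";
    }
    return "executeFetchRequest";
  }
}
```

Under these assumptions, `nextStep(State.TIMEDOUT_STATE)` lands on `executeFetchRequest`, the same step a finished query would reach, which is why the server-side error surfaced as DatabricksHttpException.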

Fix:

- Add TIMEDOUT_STATE to isErrorOperationState.
- Throw DatabricksTimeoutException for TIMEDOUT_STATE in checkOperationStatusForErrors regardless of whether sqlState is set (interactive clusters do not always populate it).
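A minimal sketch of the shape of this fix, assuming simplified signatures. The exception classes below are local stand-ins for DatabricksTimeoutException and the driver's generic error, and the `State` enum and the two-argument `checkOperationStatusForErrors` are hypothetical; the real method works on the Thrift status object.

```java
import java.util.EnumSet;
import java.util.Set;

class TimedOutFixSketch {
  // Hypothetical stand-in for the Thrift operation state type.
  enum State { RUNNING_STATE, FINISHED_STATE, ERROR_STATE, CLOSED_STATE, TIMEDOUT_STATE }

  // Stand-in for DatabricksTimeoutException.
  static class TimeoutSketchException extends RuntimeException {
    TimeoutSketchException(String message) { super(message); }
  }

  // Stand-in for the driver's non-timeout operation error.
  static class OperationSketchException extends RuntimeException {
    OperationSketchException(String message) { super(message); }
  }

  // After the fix, TIMEDOUT_STATE is part of the error set
  // (the other members are illustrative assumptions).
  static final Set<State> ERROR_STATES =
      EnumSet.of(State.ERROR_STATE, State.CLOSED_STATE, State.TIMEDOUT_STATE);

  static boolean isErrorOperationState(State s) {
    return ERROR_STATES.contains(s);
  }

  // sqlState may be null: the timeout branch does not look at it.
  static void checkOperationStatusForErrors(State state, String sqlState) {
    if (!isErrorOperationState(state)) {
      return;
    }
    if (state == State.TIMEDOUT_STATE) {
      throw new TimeoutSketchException("Query timed out on the server (sqlState=" + sqlState + ")");
    }
    throw new OperationSketchException("Operation failed in state " + state + " (sqlState=" + sqlState + ")");
  }
}
```

Branching on the state first means a missing sqlState cannot downgrade a timeout into a generic error.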

Testing

- testTimedOutStateInDirectResultsThrowsTimeoutException: Pavan's exact repro, where the server returns TIMEDOUT_STATE in directResults before polling starts
- testTimedOutStateDuringPollingThrowsTimeoutException: the server returns TIMEDOUT_STATE during polling

Additional Notes

The original ES-1717770 verification test passed on both a warehouse and an all-purpose cluster because setQueryTimeout(1) with a long-running query caused the server to return RUNNING_STATE first (query still in flight), so the driver entered the polling loop, where TimeoutHandler fired correctly. Pavan's repro consistently hits the other path: the cluster's own timeout fires first and returns TIMEDOUT_STATE directly, bypassing the polling loop entirely.
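The difference between the two paths can be replayed with scripted server responses. This is a toy model of the pre-fix behaviour, not driver code: `preFixOutcome`, its parameters, and the returned labels are hypothetical, and the client-side TimeoutHandler is reduced to a poll counter.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

class TwoPathsSketch {
  // Hypothetical stand-in for the Thrift operation state type.
  enum State { RUNNING_STATE, FINISHED_STATE, TIMEDOUT_STATE }

  // Replays the pre-fix flow and names what the caller would observe.
  // directState: state found in directResults.operationStatus.
  // pollResponses: states returned by later status polls, in order.
  // pollsBeforeClientTimeout: polls allowed before the client timeout fires.
  static String preFixOutcome(State directState, List<State> pollResponses, int pollsBeforeClientTimeout) {
    Deque<State> polls = new ArrayDeque<>(pollResponses);
    State state = directState;
    int pollCount = 0;
    // The client timeout is only checked inside the polling loop.
    while (state == State.RUNNING_STATE) {
      if (pollCount >= pollsBeforeClientTimeout) {
        return "DatabricksTimeoutException";
      }
      // An exhausted script means the query is still running.
      state = polls.isEmpty() ? State.RUNNING_STATE : polls.poll();
      pollCount++;
    }
    if (state == State.TIMEDOUT_STATE) {
      // Not treated as an error before the fix, so the fetch request fails instead.
      return "DatabricksHttpException";
    }
    return "results";
  }
}
```

In this model, `preFixOutcome(State.RUNNING_STATE, List.of(), 1)` corresponds to the original verification test and ends in the timeout exception, while `preFixOutcome(State.TIMEDOUT_STATE, List.of(), 1)` corresponds to the repro and never enters the loop.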

When using interactive clusters with enableDirectResults=true, the server can return TIMEDOUT_STATE directly in directResults.operationStatus when the cluster's own query timeout fires before the client's polling loop starts. Because TIMEDOUT_STATE was not included in isErrorOperationState, the driver silently fell through to executeFetchRequest and threw DatabricksHttpException instead of DatabricksTimeoutException.

Fix isErrorOperationState to include TIMEDOUT_STATE, and update checkOperationStatusForErrors to throw DatabricksTimeoutException for TIMEDOUT_STATE regardless of whether sqlState is set, since interactive clusters do not always populate the SQL state field.

Add tests covering:
- TIMEDOUT_STATE in directResults (server timeout fires before polling starts)
- TIMEDOUT_STATE returned during polling

Signed-off-by: Samikshya Chand <samikshya.chand@databricks.com>
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>

samikshya-db changed the title ~~Fix TIMEDOUT_STATE not recognized as error on interactive clusters~~ [ES-1717770] Fix TIMEDOUT_STATE not recognized as error on interactive clusters Mar 2, 2026

Merge branch 'main' into fix/timedout-state-interactive-cluster

924c46e

samikshya-db wants to merge 2 commits into databricks:main from samikshya-db:fix/timedout-state-interactive-cluster
